Down to the ship
AI models can’t reliably give us lines from the Cantos. Why, and what’s the fix?
Louis de Beaumont
10 Jan 2024

This one was simple, an obvious place to start. You ask an AI

What is the first line of Ezra Pound’s first Canto?

and you get

The first line of Ezra Pound's first Canto is:
“Hang it all, Robert Browning, there can be but one ‘Sordello.’”

which is, actually, quite good. The first line of the first Urcanto is either that, or

Hang it all, there can be but one Sordello!
or
Hang it all, there can be but the one “Sordello,”

Ask again,

What is the first line of Ezra Pound’s Canto I?

and you get

The first line of Ezra Pound's Canto I is:
“And then went down to the ship.”

which it is, ‘cept the comma.

Ask for the second line and you might get it.

Ask for the first line of the second Canto and you’ll be off.

“Lords, prayers cannot move mountains, or cleanse Augean stables.”

with no frame of reference, and nothing but your dusty paper copy to check against.

Within AI, many people call this ‘new’ answer a hallucination. They mean that, lacking the knowledge, the model has made something up. To combat hallucination, you’ll often see some sort of appended instruction:

If you don’t know, say you don’t know

giving,

Prompt: What is the first line of Pound’s second Canto? If you don't know, say you don't know.

Response: I apologize for the confusion in my previous response. The correct first line of Ezra Pound's second Canto is:
“Sainted Jehovah to send the rain, who had said, drygulch.”

Well, that doesn’t really work in ChatGPT (the model in the examples above). But how does ChatGPT know what the first line is? The technology is amazing. There is no ‘lookup’, no connection to the internet. Given that prompt, ‘What is the first line…’, some sort of shift occurs in the nodes, changing the path of least resistance such that the most likely first token (~ ‘word’) is

And

the second is

then

the third is

went

and so on. There’s no single grab of the whole line: And then went down to the ship,

At every step there could be an alternative token, but its probability is too low.

How? Because 1) that’s what a neural network is, and 2) enough text referring to Ezra Pound, his Cantos, Canto I, and line 1 has been passed to the model in training.
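Concretely, and only as a sketch (ChatGPT’s internals aren’t public, so this uses the open Zephyr base model that appears later in this post, and the prompt is illustrative), that one-token-at-a-time process looks like this in Python:

  # A minimal sketch of greedy, token-by-token generation with an open causal LM.
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
  model = AutoModelForCausalLM.from_pretrained("HuggingFaceH4/zephyr-7b-beta")

  prompt = "The first line of Ezra Pound's Canto I is:"
  input_ids = tokenizer(prompt, return_tensors="pt").input_ids

  for _ in range(10):
      with torch.no_grad():
          logits = model(input_ids).logits  # (1, sequence_length, vocab_size)
      # Take the single most probable next token: the path of least resistance.
      next_id = logits[0, -1].argmax()
      input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=-1)
      print(tokenizer.decode(next_id.item()))  # e.g. 'And', 'then', 'went', ...

With sampling switched on, an alternative token can occasionally win out; greedy decoding, taking the single most probable token every time, is just the clearest picture of why the same continuation keeps coming back.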

All the promise of AI and I can’t get a reliable

Set keel to breakers?

Surely this must be fixed. But how? Could we train an AI on the entirety of the Cantos? That would be EXPENSIVE, and the product would still feel unreliable. What next? We could employ RAG (Retrieval-Augmented Generation), i.e. shoring up a response with documents, like `Canto I.txt`. And, I’ll admit, this is certainly a place we’ll want to go, especially with the work going on in Library. But I approached this one differently.

Not only is RAG quite a step up (vector stores for documents, …), but The Cantos are the fundamental document of this system. Getting line 72 when you asked for line 73 would be devastating, and I don’t know how well a vector lookup would handle such a thing. I did do research into it, and these vectorised lookups are really good for semantics, not indices: asking for paragraphs on ‘the sea’ produces a vector, which is then compared against the whole document, itself turned into vectors, and voilà; this doesn’t handle line numbers.

So the solution was to build just one module of a larger system. It does this: takes a piece of text, “What is the first line of Canto I?” and extracts references:


  [{ "canto": "I", "lines": [1] }]
    

And that is hardy. I extended my own store of digitised Cantos with a webapp that takes that piece of JSON and returns the lines:


  [{
      "canto": "I",
      "lines": [
          { "number": 1, "content": "And then went down to the ship," }
      ]
  }]
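For concreteness, here is a hedged sketch of the two steps in Python. The endpoint URL and the helper names are illustrative assumptions, not the actual implementation (that lives in the repo linked at the end of this post):

  # Sketch of the module: extract references, then resolve them against the store.
  import json
  import requests

  CANTOS_API = "https://example.com/cantos/lookup"  # hypothetical webapp endpoint

  def extract_references(question: str) -> list:
      # In the real module this is a call to the fine-tuned model, which is
      # trained to respond with nothing but the JSON itself. Hard-coded here.
      return [{"canto": "I", "lines": [1]}]

  def resolve_references(references: list) -> list:
      # POST the references to the webapp; it answers with the actual lines.
      response = requests.post(CANTOS_API, json=references, timeout=10)
      response.raise_for_status()
      return response.json()

  refs = extract_references("What is the first line of Canto I?")
  print(json.dumps(resolve_references(refs), indent=2))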
    

So then what? Well, that’s not the full solution. All we have done is resolve references. But what if the initial question was not asking for a line, but for something else?

List all the relationships in Canto VI.

Our reference extraction would return the whole of Canto VI. AND THEN what we do is add that to the CONTEXT. Going back to the first example for simplicity:

Context: Answer the user’s question. You might find the following information useful: [{ "canto": "I", "lines": [{ "number": 1, "content": "And then went down to the ship," }] }]

Prompt: What is the first line of Canto I?

Now the AI has everything it needs to answer the question reliably.
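In code, still as a sketch (this reuses the illustrative helpers above; the messages are in OpenAI-style chat format, and any chat-tuned model takes an equivalent shape):

  # Sketch: stitch the resolved lines into the context before the final call.
  import json

  def build_messages(question: str) -> list:
      refs = resolve_references(extract_references(question))
      context = ("Answer the user's question. "
                 "You might find the following information useful: "
                 + json.dumps(refs))
      return [{"role": "system", "content": context},
              {"role": "user", "content": question}]

  print(build_messages("What is the first line of Canto I?"))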

So, how did it go, and what’s it really good for?

This was my first experience getting stuck into LLMs (my first experiment used BERT, a more singular model). Using only a little bit of fine-tuning, I got my base model (Zephyr 7B β) performing as desired; a huge success, something that bodes well for the future, and an essential tool in the belt. One shame was that I didn’t manage (yet) to do <some quite technical things> that would let me run the fine-tuned model on my own computer.

Part of me feels hindered by the current technology, which is great; it means we’re at the front of it, and when things become more publicly possible, we’ll be there. What this means is that using this model currently incurs a financial cost, because we have to rent the computer it runs on. There was some expense in fine-tuning the model, around $20 total, largely spent on the learning curve. I expect we’ll see technological improvements in the first half of this year that will make running this model nigh cost-free.
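For a flavour of what a cheap fine-tune of a 7B model typically involves, here is an illustrative parameter-efficient (LoRA) setup via the peft library; the hyperparameters are placeholders, not necessarily what was used here:

  # Illustrative only: a LoRA configuration wraps the base model so that just a
  # small set of adapter weights trains, which is what keeps fine-tuning cheap.
  from peft import LoraConfig, get_peft_model
  from transformers import AutoModelForCausalLM

  base = AutoModelForCausalLM.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
  config = LoraConfig(r=16, lora_alpha=32,
                      target_modules=["q_proj", "v_proj"],
                      task_type="CAUSAL_LM")
  model = get_peft_model(base, config)
  model.print_trainable_parameters()  # a tiny fraction of the 7B weights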

And what is it good for? As detailed in /about, goal 2, really. Goal 1 (investigating form) is so hands-on that we’re probably working from the Cantos, starting with their lines, not having to look their lines up. But in my current work on goal 1, a deep dive into Canto VI, I have already wanted this functionality at hand; and since this experiment produced a little module that can be plugged into larger systems, it is already proving itself useful (if it can be run cheaply).

My work on Canto VI began with an investigation of the ring, but the current digital utility lies not so much in the goal of the investigation as in the tools available; i.e. the tech is still being dragged along by the mind, no inversion yet. The imagined product really, at the moment, is artwork.

All tech details of this experiment are available at https://github.com/POUNDIAN/down-to-the-ship.